PageRank: Three Distributed Algorithms
Abstract
This paper shows for the first time that there are multiple PageRank definitions, and that justifying PageRank requires more than the literature offers ([1], [2] and [5]). Adopting a formal approach, the paper provides such a justification. It also shows that the irreducibility restriction on the PageRank transition matrix [5] is unnecessary, which matters because it gives the personalisation vector more intuitive force. Noting the difficulties of calculating PageRank centrally, the paper then shows, via the Chazan-Miranker theorem [10], that distributed calculation algorithms are possible, and it presents three novel algorithms of this kind. The first takes documents/pages as being of principal interest; the second and third, in an attempt to reduce communication overheads, focus on nodes/web servers. Empirical test results are presented. These confirm reduced message numbers for algorithms 2 and 3. They also show that more investigation is required into suitable thresholds within asynchronous environments.
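For orientation, the quantity the abstract discusses can be illustrated with a minimal single-machine sketch of personalised PageRank computed by repeated Jacobi-style updates. This is not one of the paper's three distributed algorithms. The function name `pagerank_sketch`, the damping factor `alpha = 0.85`, the stopping threshold `tol`, and the choice to redistribute the rank of pages without out-links according to the personalisation vector `v` are all assumptions made for illustration.

```python
def pagerank_sketch(out_links, v, alpha=0.85, tol=1e-10, max_iter=1000):
    """Illustrative personalised PageRank (hypothetical helper, not the paper's method).

    out_links[i] is the list of pages that page i links to.
    v is the personalisation vector (non-negative, sums to 1).
    """
    n = len(v)
    x = list(v)  # start from the personalisation vector
    for _ in range(max_iter):
        # Every page receives its personalisation share first.
        new = [(1.0 - alpha) * v[i] for i in range(n)]
        for src, outs in enumerate(out_links):
            if outs:
                # Split the damped rank of src evenly over its out-links.
                share = alpha * x[src] / len(outs)
                for dst in outs:
                    new[dst] += share
            else:
                # Assumption: a page with no out-links spreads its rank via v.
                for i in range(n):
                    new[i] += alpha * x[src] * v[i]
        change = sum(abs(a - b) for a, b in zip(new, x))
        x = new
        if change < tol:  # stop once successive iterates differ by less than tol
            break
    return x


# Three pages: 0 -> 1, 2; 1 -> 2; 2 -> 0, with a uniform personalisation vector.
ranks = pagerank_sketch([[1, 2], [2], [0]], [1 / 3, 1 / 3, 1 / 3])
```

In this sketch every component is updated in lockstep from the previous iterate. An asynchronous distributed variant would let each page or server update from whatever neighbour values it has most recently received, which is where the choice of stopping threshold becomes delicate.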
Related resources
Fast Distributed PageRank Computation
Over the last decade, PageRank has gained importance in a wide range of applications and domains, ever since it first proved to be effective in determining node importance in large graphs (and was a pioneering idea behind Google’s search engine). In distributed computing alone, PageRank vectors, or more generally random walk based quantities have been used for several different applications ran...
Personalizing PageRank-Based Ranking over Distributed Collections
In distributed work environments, where users are sharing and searching resources, ensuring an appropriate ranking at remote peers is a key problem. While this issue has been investigated for federated libraries, where the exchange of collection-specific information suffices to enable homogeneous TFxIDF rankings across the participating collections, no solutions are known for PageRank-based ran...
Subgraph Rank: PageRank for Subgraph-Centric Distributed Graph Processing
The growth of Big Data has seen the increasing prevalence of interconnected graph datasets that reflect the variety and complexity of emerging data sources. Recent distributed graph processing platforms offer vertex-centric and subgraph-centric abstractions to compose and execute graph analytics on commodity clusters and Clouds. Naïve translation of existing graph algorithms to these programming...
Distributed Page Ranking in Structured P2P Networks
This paper discusses techniques for performing distributed page ranking on top of structured peer-to-peer networks. Distributed page ranking is needed because the size of the web grows at a remarkable speed and centralized page ranking is not scalable. Open System PageRank is presented in this paper based on the traditional PageRank used by Google. We then propose some distributed page rank...
Knowing Where to Search: Personalized Search Strategies for Peers in P2P Networks
Optimizing and focusing search and results ranking in P2P networks becomes more and more important with the increasing size of these networks. Even though a few approaches have already started to investigate the computation of PageRank-like values in P2P environments, none so far has investigated how personalization could be added to it. This paper tackles the problem of distributedly computing...